A METHOD AND APPARATUS FOR ADAPTIVE ENCODING FRAMED DATA SEQUENCES

FIELD OF THE INVENTION 

The invention relates to encoding methods and apparatus for encoding framed
data such as video encoding methods, more in particular to rate control in low cost and
large-scale implementation of video coding systems.

BACKGROUND OF THE INVENTION 

Data is often transmitted through data networks in the form of a sequence or
stream of packets or frames, i.e. a sequence of discretised data bundles, each bundle 
having a specific data content. One well-known example of such data streams is a 
sequence of video frames. These data streams may be encoded especially for reduction in 
the data rate of the stream-as-transmitted. The reduction of data rate is often necessary in 
order to reduce the bandwidth of the transmission channel used to transmit the data. 
Generally, these streams must be encoded and decoded in real-time. This places 
limitations on the amount of memory and on the processing capacity of the devices used
to encode and decode. 

A video information stream comprises a time sequence of video frames. Said
time sequence of video frames can be recorded for instance by a video camera/recorder or
may be sent from memory or created artificially or synthetically. Each of said video frames
can be considered as a still image. Said video frames are represented in a digital system as
an array of pixels. Each pixel may be defined by a set of characteristics of the data in the
pixel, e.g. each pixel may comprise luminance or light intensity and chrominance or color
information. For a recent review of luminance and chrominance see "Colour Image
Processing" by Sanwine, Electronics & Communication Journal, vol. 12, No. 5, October
2000, pages 211 to 219.

The information associated with each pixel is stored in a memory of said digital 
system. For each pixel some bits are reserved. From a programming point of view each 
video frame can be considered as a two-dimensional data type, although said video frames 
are not necessarily rectangular. Note that fields from an interlaced video time sequence can
also be considered as video frames. 


In principle when said video information stream must be transmitted between two 
digital systems, this can be realized by sending the video frames sequentially in time, for 
instance by sending the pixels of said video frames and thus the bits representing said 
pixels sequentially in time over a transmission channel. 

There exist, however, more elaborate transmission schemes enabling faster and
more reliable communication between two digital systems. Said transmission schemes are 
based on encoding said video information stream in the transmitting digital system, 
transmitting said encoded video information stream over a transmission channel and 
decoding the encoded video information stream in the receiving digital system. Note that 
the same principles can be exploited for the transmission and storage of data, e.g. to
memory or bulk or permanent storage. There is no limit on the types of transmission 
channel, that is it can comprise a transmission channel of a Local Area Network, either 
wired or wireless, a Wide Area Network such as the Internet, the air interface of a cellular 
telephone system, etc. 

During encoding the original video information stream is transformed into another 
digital representation. Said digital representation is then transmitted. During decoding the
original video information stream is reconstructed from said digital representation. 

For example, the MPEG-4 standard defines such an efficient encoded digital 
representation of a video information stream suitable for transmission and/or storage.

Encoding requires operations on the video information stream. Said operations are
performed on a digital system (for instance in said transmitting digital system). Such
processing is often called Digital Signal Processing (DSP). Each operation performed by a
digital system consumes power. The way in which said operations for encoding are 
performed is called a method. Said methods have some characteristics such as encoding 
speed and the overall power consumption needed for encoding. 

Said digital system can be implemented in a variety of ways, e.g. an application-
specific hardware such as an accelerator board for insertion in a personal computer or a
programmable processor architecture. It is well-known that most power consumption in
said digital systems, while performing real-time multi-dimensional signal processing such
as video stream encoding on said digital systems, is due to the memory units in said
digital systems and the communication path between said memory units. More precisely,
individual read and write operations from and to memory units by processors and/or
datapaths and between memories become more power expensive when said memory units
are larger, and so does the access time or latency from the busses. Naturally also the
amount of read and write operations determines the overall power consumption and
the bus loading. The larger the communication path, the larger is also the power
consumption for a data transfer operation. With communication is meant here the
communication between memory units and the processors and data paths found in said
digital system and between memories themselves. There is also a difference between on-
and off-chip memories. Note that the same considerations are valid when considering
speed as a performance criterion.

As the power consumption of said digital system is dominated by read and write 
operations, thus manipulations on data types and data structures, such as video frames,
said methods are considered to be data-dominated. 

As the algorithm specification, the algorithm choice and its implementation 
determine the amount of operations and the required memory sizes it is clear that these
have a big impact on the overall power consumption and other performance criteria such 
as speed and bus loading. 

A method for encoding a video information stream, resulting in a minimal power
consumption of the digital system on which the method is implemented, and exhibiting
excellent performance, e.g. being fast, must be based on optimized data storage, related to
memory sizes, and data transfer, related to the amount of read and write operations.

The channel between said transmitting and said receiving device always has a 
certain and usually a limited bandwidth. The amount of bits that can be transmitted per 
time unit is upper-bounded by the bandwidth available for the transmission. This 
available bandwidth may be time dependent depending upon network loads. An 
encoding method which is inefficient or which is not adaptable may result in data being 
lost or discarded or, at best, delayed. An encoding method should be capable of dealing
with such channel limitations by adapting its encoding performance in some way, such
that fewer bits are transmitted when channel limitations are enforced. Said encoding method
adaptation capabilities should again be power consumption and speed efficient.
Performing encoding steps which, due to channel bandwidth adaptations or other
limitations, become useless and are thus unnecessary, should be avoided. Note that said
encoding method adaptation capabilities should be such that the quality of the transmitted
data is preserved as much as possible. Minimum Quality of Service (QoS)
requirements should be maintained.

Naturally when such a power consumption and speed optimal encoding method
exists it can be implemented on a digital system, adapted for said method. This adaptation 
can be done by an efficient programming of programmable (application specific) processor 
architectures or by designing and fabricating an application-specific or domain-specific 
processor with the appropriate memory units. This can be a stand-alone unit or may be 
included within a larger processing structure such as a computer. 

Prior art encoding methods with adaptation capabilities take into account channel 
bandwidth limitations by adapting some encoding parameters based on predictions of the 
bit rate needed, said predictions being based on historic data of said bit rate only. Said bit 
rate predictions do not take into account a characterization of the current video frame to
be encoded. Said prior art encoding methods do not use a relation, also denoted model,
relating said bit rate, characteristics of the to-be-encoded video frame and said encoding
parameters [Tihao Chiang and Ya-Qin Zhang, "A New Rate Control Scheme Using
Quadratic Rate Distortion Model", IEEE Trans. on Circuits and Systems for Video
Technology, vol. 7, No. 1, pp. 246-250, February 1997.], [Wei Ding, and Bede Liu, "Rate
Control of MPEG Video Coding and Recording by Rate-Quantization Modeling", IEEE
Trans. on Circuits and Systems for Video Technology, vol. 6, No. 1, pp. 12-20, February
1996.].

Prior art encoding methods with good quality preserving properties having 
adaptation capabilities, taking into account channel bandwidth limitations by adapting
some encoding parameters, e.g. by taking into account a characterization of the video
frame to be encoded, have severe drawbacks from the implementational point of view
[Jiann-Jone-Chen, and Hsueh-Ming-Hang, "Source model for transform video coder and
its application, II. Variable frame coding.", IEEE Trans. on Circuits and Systems for Video
Technology, vol. 7, No. 2, pp. 299-311, April 1997.], [Jordi Ribas-Corbera, and Shawmin
Lei, "Rate Control in DCT Video Coding for Low-Delay Communications", IEEE Trans. on
Circuits and Systems for Video Technology, vol. 9, No. 1, pp. 172-185, February 1999.],
[Anthony Vetro, Huifang Sun, and Yao Wang, "MPEG-4 Rate Control for Multiple Video
Objects", IEEE Trans. on Circuits and Systems for Video Technology, vol. 9, No. 1,
pp. 920-924, February 1999.].

Where the adaptation scheme does not work correctly, e.g. it generates a data rate
which cannot be transmitted, the system generally only has two options: discard the
excess data or stop the processing. The latter is often impossible or undesirable as real-
time transmission is required. The former solves the problem at the cost of data loss, which
has to be compensated by other techniques, e.g. regeneration of data by interpolation
between frames.

Figure 8A shows a schematic representation of a prior art encoding scheme with a
first encoding step (10) and a second encoding step (20) for encoding a video frame (320)
on a time axis (300) with respect to a reference video frame (310). Said encoded current
video frame (320) is transmitted via a bandwidth limited channel (60), being preceded
by some buffering means (30). Potentially some video frame discarding means (50) are
present in between said encoding steps (10) and (20) or before said first encoding step (70).
Said first encoding step is executed in a block-based way. (220) represents the block loop,
meaning that essentially all blocks are first sub-encoded before said first sub-encoding step
is finished with said current video frame and the method moves on to a second sub-
encoding step. Said second sub-encoding step can be executed in a similar fashion but
with a different loop. Said prior-art method adapts the bit rate, taking into account
possible buffer information (100) and information about the complexity of the first sub-
encoded video frame (140), by either adapting parameters of said second sub-encoding
(120) or by discarding said current video frame (150). A decision circuit (40) takes this
adaptation decision. Said first sub-encoding step possibly comprises transformation
(motion) estimation and transformation (motion) compensation steps (11) and (12). Note
that discarding (70) based on buffer (30) fullness information (170) only before first sub-
encoding is also often used. No information on video frame complexity is used.

There still remains a requirement to improve the efficiency of encoding methods
and apparatus for streams of framed data such as video frames. In particular there is a
need for improved adaptive encoding methods and apparatus for framed data sequences.

SUMMARY OF THE INVENTION

The following reference to "a frame" also includes within its meaning "a video 
frame". 

In a first aspect of the invention a method and an apparatus are provided for
encoding a current frame, said method also having adaptation capabilities in case of
channel bandwidth limitations. It is especially suited from an implementational point of
view as it is block-oriented. Said method, having a first and a second sub-encoding step
applicable to blocks of frames, relies on a quantity of the to-be-encoded frame, more
precisely of a first sub-encoded version of said to-be-encoded frame, said quantity being
predicted from a reference frame, at least being first sub-encoded before said current
frame. Said quantity is used for adapting encoding parameters of said second sub-
encoding step and/or deciding to skip said second sub-encoding step, hence skipping said
current frame.

Within said adaptive encoding method and apparatus, partitioning or dividing
said current frame into blocks and performing a first and second sub-encoding step on
blocks can be distinguished. Said second sub-encoding step adapts its encoding
parameters based on a quantity of said first sub-encoded part of said current frame as a
whole being determined by prediction from a previously encoded reference frame. Said
quantity is not determined from said first sub-encoded part of said current frame as a
whole, as at this stage of the encoding process said first sub-encoded part of said whole
current frame is not available yet. Said steps are performed block-per-block of said current
frame.

In a second aspect of the invention a method and an apparatus are provided for
encoding a current frame with respect to a reference frame, said method also having
adaptation capabilities in case of channel bandwidth limitations. Said
method, having a first and a second sub-encoding step, relies on a quantity of the to-be-
encoded current frame, more precisely of a first sub-encoded version of said to-be-encoded
current frame, said quantity being predicted from a reference frame, at least being first
sub-encoded before said current frame. Said quantity is used for adapting encoding
parameters of said second sub-encoding step or deciding to skip said second sub-encoding
step entirely. Said quantity computation can be based on the labeling of blocks of said
reference frame, said labeling being based on the performance of said first sub-encoding
step applied to said reference frame.

Within said adaptive encoding method and apparatus, a step of partitioning or 
dividing a reference frame into blocks, a step of labeling said blocks in accordance with the 
performance of the first sub-encoding applied to said reference frame, a step of computing 
a quantity based on said labeling of said blocks and of performing a first and second sub- 
encoding step on the to-be-encoded current frame can be distinguished. Said second sub- 
encoding step adapts its encoding parameters based on said quantity. In an embodiment 
said first and second encoding steps applied to said to-be-encoded current frame are 
performed per block of said current frame. 

The bit rate control methods in accordance with the present invention solve the
problem of apparent incompatibility between local and sequential processing of frame
data and use of the most efficient rate control algorithms, i.e. the ones that rely on rate-
distortion models whose parameters are computed by a pre-analysis of the complete
frame to be encoded. An embodiment of the invented method is a hybrid scheme
comprising motion compensation and displaced frame difference coding, avoiding the
pre-analysis stage while keeping the benefit from rate-distortion based rate control, by
predicting the mean absolute difference (MAD) of the expected prediction error.
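
For orientation, the mean absolute difference of a prediction error is commonly
written as below. The symbols are notation assumed here for illustration only and are
not fixed by the text: N is the number of pixels considered, c_i a sample of the current
frame and p_i the corresponding sample of the motion compensated prediction.

```latex
\mathrm{MAD} = \frac{1}{N} \sum_{i=1}^{N} \left| c_i - p_i \right|
```

In a pre-analysis based scheme this value is only available once the complete frame has
been motion compensated, which is the stage the embodiment avoids by using a predicted
value instead.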

The present invention may provide a method of adaptive encoding at least a part 
of a current frame of a sequence of frames of framed data, comprising the steps of: 
dividing said part of said current frame into blocks; 
performing a first sub-encoding step on a block; thereafter 

performing a second sub-encoding step on said first sub-encoded block, said second sub- 
encoding step adapting its encoding parameters based on a quantity of said first sub- 
encoded part of said current frame being determined by prediction from a reference 
frame; and 

said steps are performed on another block of said part of said current frame. 

Subsequently, the steps are performed on another block of said part of said current
frame. The adaptive encoding method is capable of taking into account channel bandwidth
limitations by adapting said second sub-encoding step's encoding parameters based on said
quantity. The method may comprise the step of transmitting said second sub-encoded
blocks over the channel whose bandwidth limitations are taken into account. The
method includes that said adaptive encoding of at least a part of said current frame is
performed with respect to a reference frame, said first sub-encoding step comprising:
performing motion estimation of a block with respect to said reference frame;
thereafter performing motion compensation of said block; and thereafter
determining the error block.
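
The three parts of the first sub-encoding step can be illustrated with a minimal sketch.
It assumes frames given as lists of rows of integer samples, a full-search block matcher
and the sum of absolute differences as matching cost; none of these choices, nor the
function names, are prescribed by the text.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def get_block(frame, top, left, size):
    """Cut a size x size block out of a frame given as a list of rows."""
    return [row[left:left + size] for row in frame[top:top + size]]


def motion_estimate(current, reference, top, left, size, search):
    """Full search: return the (dy, dx) with the lowest SAD within +/- search."""
    height, width = len(reference), len(reference[0])
    target = get_block(current, top, left, size)
    best_vector = (0, 0)
    best_cost = sad(target, get_block(reference, top, left, size))
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + size > height or x + size > width:
                continue  # candidate block would fall outside the reference frame
            cost = sad(target, get_block(reference, y, x, size))
            if cost < best_cost:
                best_vector, best_cost = (dy, dx), cost
    return best_vector


def first_sub_encode(current, reference, top, left, size=2, search=1):
    """Motion estimation, then motion compensation, then the error block."""
    dy, dx = motion_estimate(current, reference, top, left, size, search)
    predicted = get_block(reference, top + dy, left + dx, size)  # compensation
    target = get_block(current, top, left, size)
    error = [[c - p for c, p in zip(row_c, row_p)]
             for row_c, row_p in zip(target, predicted)]
    return (dy, dx), error
```

For a block whose content merely moved by one pixel between the reference frame and the
current frame, the sketch returns that displacement and an all-zero error block.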

The method can include that said adaptive encoding of at least a part of said 
current frame is performed with respect to a reference frame, said first sub- 
encoding step comprising: 

performing transformation parameter estimation of a block with respect to said 
reference frame; thereafter 

performing a transformation compensation step on said block; and thereafter
determining the error block. 

The method may also include that said second sub-encoding step is selected from 
the group comprising wavelet encoding, quadtree or binary coding, DCT coding and
matching pursuits. 

The present invention may provide a method for encoding a sequence of frames of
framed data comprising the steps of:

determining for at least one current frame, selected from said sequence of frames,
an encoding parameter based on a quantity of said current frame being determined by
prediction from reference frames; and thereafter

encoding said current frame taking into account at least said encoding parameter.

The method can include that said encoding step takes into account at least one 
encoding parameter being determined directly from at least one of said reference frames. 

The method can include that said encoding parameter of said current frame and 
said encoding parameter of said reference frame are of the same type. 

The method can also include that said encoding step exploits an average of said 
encoding parameter of said current frame and said encoding parameter of one of said 
reference frames.
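
As a minimal sketch of such averaging, assume the encoding parameter is an integer
quantisation parameter. The function name, the rounding rule and the clamp to the range
1 to 31 (as used by MPEG-4 style quantisers) are assumptions made for this illustration.

```python
def averaged_parameter(predicted_qp, reference_qp, qp_min=1, qp_max=31):
    """Average the parameter derived for the current frame with that of a reference frame.

    The clamp to qp_min..qp_max is an assumption, not taken from the text.
    """
    average = (predicted_qp + reference_qp + 1) // 2  # integer mean, halves rounded up
    return max(qp_min, min(qp_max, average))
```

Averaging damps frame-to-frame jumps of the parameter, at the price of reacting more
slowly to a genuine change in content.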

The method also includes that said quantity of said current video frame is
determined from one reference frame, said quantity determination comprises the steps of
identifying within said reference frame a first and second region, determining a first
region quantity for said first region, a second region quantity for said second region and
computing said quantity of said current frame from the first region quantity, said second
region quantity and the time interval between said reference frame and said current
frame. The method can also include that said first region is related to intra-coded parts
and substantially non-moving parts of said reference frame and said second region is
related to moving parts of said reference frame.

The method can also include that said quantity is based on said second region
quantity multiplied with said time interval. The method also includes that said quantity is
a measure of the information content within said current frame. The method also includes
that said quantity is a measure of the energy content within said current frame. The
method also includes that said quantity is a measure of the complexity of said current
frame. The method also includes that said first and second region quantities are a measure
of the information content within said first and second region of said current frame. The
method also includes that said measure is derived from the sum of absolute difference
between the motion compensated current frame and the previous frame. The method also
includes that said measure is derived from the error norm between the motion
compensated current frame and the previous frame. The method also includes that said
measure is derived from the sum of an absolute difference between the first region or
second region of the motion compensated current frame and a previous frame. The
method also includes that said measure is derived from the error norm between first
region or second region of the motion compensated current frame and the previous
frame. The method also includes that the above steps are applied to parts of said current
frame. The method also includes that said current frames are divided into blocks and said
steps are applied on a block-by-block basis. The method further comprises selecting either
that said encoding step is based on an encoding parameter based on said quantity being
predicted or said encoding being based on a combination of said encoding parameter
based on said quantity being predicted and said encoding parameter being determined
directly from at least one of said reference frames. The method may also include that said
selection is based on detection of oscillations in the generated sequence of encoding
parameters.

The present invention may provide a method for encoding a sequence of frames of
framed data, comprising the step of determining for at least one current frame of said
sequence of frames whether said current frame will be selected for encoding before
encoding said current frame. The method may include that said selection is based on a
prediction of a quantity of said current frames from reference frames.

The present invention includes a method for encoding a sequence of frames of
framed data comprising the step of determining for at least one current frame of said
sequence of frames which encoding parameters will be used for encoding said current
frame before encoding said current frame with said encoding parameters. The method
may include that said determining of encoding parameters is based on a prediction of a
quantity of said current frames from reference frames.

The present invention may provide a method of adaptive encoding at least a part
of a current frame of a sequence of frames of framed data with respect to a reference frame
comprised in the sequence, the method comprising the steps of:
dividing said reference frame into blocks and labeling said blocks of said reference frame
in accordance with the performance of a first sub-encoding step applied to said reference
frame;
computing a quantity based on the labeling of said blocks;
performing said first sub-encoding step on said current frame;
performing a second sub-encoding step on said first sub-encoded frame, said second sub-
encoding step adapting its encoding parameters based on said quantity.

The present invention provides a method of adaptive encoding at least a part of a
current frame of a sequence of frames of framed data, with respect to a reference frame
comprised in the sequence, the method comprising the steps of:
dividing said reference frame into blocks;
performing a first sub-encoding step on said reference frame with respect to a previous
reference frame;
labeling said blocks of said reference frame based on said first sub-encoding step's
performance and a block's motion vector;
determining for each of said blocks of said reference frame a measure of difference
between related blocks in said previous reference frame;
computing a quantity from said measures of differences for said blocks and exploiting the
labeling of said blocks;
performing said first sub-encoding step on said current frame; thereafter
performing a second sub-encoding step on said first sub-encoded frame, said second sub-
encoding step adapting its encoding parameters based on said quantity.

The method may include that said computing of said quantity takes into account the time
elapsed between said current frame and said reference frame. The method may include
that said blocks of said reference frame have a first label when said blocks are intra-coded
or when said blocks have a substantially zero motion vector, said blocks of said reference
frame have a second label otherwise, said computed quantity being the sum of:
the sum of all measures of differences of blocks with a first label; and
a normalized sum of all measures of differences of blocks with a second label multiplied
with the time elapsed between said current frame and said reference frame.
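
The labeling and the resulting sum can be sketched as follows. The text does not state
how the second sum is normalized or when a motion vector counts as substantially zero,
so the divisor `normalization` and the threshold `zero_threshold` are hypothetical
parameters, as are all function and constant names.

```python
FIRST_LABEL = 1   # intra-coded, or (substantially) zero motion vector
SECOND_LABEL = 2  # every other block


def label_block(is_intra, motion_vector, zero_threshold=0):
    """Label one reference-frame block from its coding mode and motion vector."""
    dy, dx = motion_vector
    still = abs(dy) <= zero_threshold and abs(dx) <= zero_threshold
    return FIRST_LABEL if (is_intra or still) else SECOND_LABEL


def predicted_quantity(blocks, elapsed_time, normalization=1.0):
    """Predict the quantity of the current frame from labeled reference blocks.

    blocks: iterable of (is_intra, motion_vector, difference_measure) tuples.
    normalization: assumed divisor for the second-label sum (not given in the text).
    """
    first_sum = 0.0
    second_sum = 0.0
    for is_intra, motion_vector, difference in blocks:
        if label_block(is_intra, motion_vector) == FIRST_LABEL:
            first_sum += difference
        else:
            second_sum += difference
    return first_sum + (second_sum / normalization) * elapsed_time
```

Only the contribution of the moving blocks is scaled with the elapsed time in this
reading, so a longer gap between reference and current frame raises the predicted
quantity in proportion to the amount of motion.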

The present invention will be described with reference to the following drawings. 

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1 shows the quantisation parameter (QP) (selected as the to be adapted
parameter here in this embodiment) dependency on the mean absolute difference (selected
in this embodiment as the measure of frame complexity) for prior art pre-analysis based
techniques (solid lines) and the invented prediction based approach.

Figures 2-7 show PSNR (peak signal to noise ratio), buffer fullness and QP as function
of time for a reference approach, based on pre-analysis and a poor estimator (in Figure 2),
an estimator according to one embodiment of the invention (in Figure 3) and a further
improved embodiment based on averaging (in Figure 4). Figures 5, 6 and 7 show some more
comparisons as described in the text below.

Figure 8A shows a schematic diagram of a conventional encoding scheme
with a first encoding step (10) and a second encoding step (20) for encoding a video frame
(320) on a time axis (300) with respect to a reference video frame (310).

Figure 8B shows an encoding scheme in accordance with an embodiment of the
present invention.


DETAILED DESCRIPTION OF THE INVENTION

The present invention will be described with reference to certain embodiments and
drawings but the present invention is not limited thereto but only by the claims. In
particular, the present invention will mainly be described with reference to video streams
but the present invention may be applied to other data streams such as pure audio streams
or other forms of packetised data streams.

One aspect of the present invention concerns a method of encoding a sequence of
video frames. Said encoding method comprises at least two sub-encoding steps within
said method. A first sub-encoding step performs a first partial encoding of a video frame
or a part thereof. A second sub-encoding step performs a second partial encoding of the
result of said first sub-encoding method. In accordance with an aspect of the present
invention the sub-encoding steps are not performed on the video frame or parts thereof as
a whole but on blocks of said video frame. Each sub-encoding step may comprise several
steps, i.e. is a sub-encoding method in its own right.

Depending on the content of the sequence of video frames, said encoding method
can result in a bit stream, to be transmitted over a channel, which can vary in the amount
of bits per time unit. As a channel generally has a limited bandwidth and storage
resources, useful for buffering the bit stream temporarily, at a device are also generally
limited in size, methods within said encoding methods for dealing with such overflow
situations are preferred. Said methods adapt the encoding performed by the encoding
methods. Said methods can be denoted adaptation methods. A first adaptation method
decides on whether the video frame under consideration will be skipped, meaning will not
be encoded, hence not be transmitted. Said first adaptation method performs a so-called
hard decision. A second adaptation method decides on the amount of bits that can be
spent for a video frame while encoding. Said second adaptation method performs a so-
called soft decision.
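
The difference between the two kinds of decision can be shown with a small sketch that
uses buffer fullness as its only input, purely for illustration. The threshold, the
budget rule and the function names are assumptions and do not describe the rate control
of the invention.

```python
def hard_decision(buffer_fullness, buffer_size, skip_threshold=0.8):
    """Skip method: return True if the frame should not be encoded at all."""
    return buffer_fullness > skip_threshold * buffer_size


def soft_decision(buffer_fullness, buffer_size, channel_bits_per_frame):
    """Bit budget for the frame, shrinking as the buffer fills up.

    Assumed rule: a half-full buffer yields exactly the channel share per frame,
    an emptier buffer allows more bits, a fuller buffer fewer.
    """
    free_fraction = max(0.0, 1.0 - buffer_fullness / buffer_size)
    return int(2 * channel_bits_per_frame * free_fraction)
```

The hard decision yields a yes or no per frame, whereas the soft decision yields a
number of bits that the encoder then has to meet by choosing its parameters.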

Said first adaptation method, further denoted skip method, can exploit the result of
a first sub-encoding method, which includes the decision whether the video frame will be
skipped or not. This implies that the situation can occur that encoding
effort, more in particular first sub-encoding, has already been carried out (has taken up
resources) although it is not further needed as the video frame will not be encoded. Such
situations are less preferred when dealing with devices with limited resources and when
power consumption, speed and video content are to be optimized.

Said second adaptation method, further denoted an adaptive quantization method,
can exploit the result of said first sub-encoding method for determining or adapting
parameters, e.g. the quantization parameter, to be used in said second sub-encoding
method, in order to spend a targeted number of bits. Said adaptive quantization method
determines said quantization parameter based on essentially all blocks of said first sub-
encoded video frame in order to have a homogeneous quality over said frame. As both
said sub-encoding methods are typically block-oriented, meaning executed for each block
separately, said non-block oriented decision taking within said adaptive quantization
method does not match well with said sub-encoding methods from the viewpoint of data
transfer and storage, leading to sub-optimal power consumption.

It should be noted that skipping or discarding before the first sub-encoding based
on buffer-fullness information only and hence not taking into account complexity
information about the video frame to be encoded is known.

An encoding method having adaptation methods can be denoted an
adaptive encoding method. An adaptive encoding method may comprise the steps of
dividing said part of said video frame to be encoded into blocks, then performing for
essentially all of said blocks a first sub-encoding step, and thereafter determining a quantity of
said first sub-encoded part of said video frame. Said quantity is then used for determining
the second sub-encoding step's parameters. Said second sub-encoding step is then
executed using these parameters in a block-oriented way, like said first sub-encoding step,
thus for essentially all blocks of said part of said video frame separately but one after
another. It must be emphasized that said second sub-encoding method needs a quantity
which is characteristic for said part of said video frame as a whole for reasons of quality
homogeneity. This results in a bad data locality, meaning that said first encoding step consumes
blocks and produces first sub-encoded blocks which need to be stored. Essentially all said
first sub-encoded blocks are needed before the said second sub-encoding step can be
started, which will consume again said first sub-encoded blocks. The adaptive nature of
said encoding method requires adaptation of the encoding parameters, more in particular
said second sub-encoding step's parameters, based on the varying content of the video
frame or part thereof to be encoded. Said adaptation is thus in principle based on a global
quantity of said video frame or part thereof, more in particular, said first sub-encoded
video frame or part thereof. The block-based nature of said sub-encoding methods cannot
be exploited optimally for power consumption optimization because the adaptive nature
of said encoding method needs a global quantity, thus a plurality of blocks is needed.

In an embodiment of the present invention, said quantity related to a characteristic 
for said video frame or part thereof, more in particular of said first sub-encoded video 
frame or part thereof, is not determined from said first sub-encoded blocks but replaced by 
a predicted quantity. The meaning of prediction includes both forward and
backward prediction. The "quantity" is preferably a value related to the expected data
content or data rate. Hence, this quantity preferably allows adaptation of the parameters
of the second sub-encoding step to optimize the data rate. Said predictive determination of 
said quantity is performed before the second sub-encoding step starts. As said predictive 
determining of said quantity does not need said first sub-encoded blocks, and thus said 
predictive determining of said quantity is independent of said first sub-encoded blocks in 
this sense, both sub-encoding methods can be combined in a single loop. Thus, said first 
and said second sub-encoding step are performed after each other on a block before one 
starts said sub-encoding steps on another block. 
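
The single-loop organisation described above can be illustrated by the following minimal Python sketch. It is only an illustration under stated assumptions: the function names (`encode_frame_single_loop`, `choose_parameters`) and the toy sub-encoders (a residual against a co-located reference value, followed by a truncating quantizer) are hypothetical and are not taken from the described embodiment.

```python
def encode_frame_single_loop(blocks, predicted_quantity, first_sub_encode,
                             choose_parameters, second_sub_encode):
    """Run both sub-encoding steps on each block before moving to the next one."""
    # The parameters depend only on the predicted quantity, so they can be
    # fixed before any block of the current frame has been sub-encoded.
    params = choose_parameters(predicted_quantity)
    output = []
    for block in blocks:
        intermediate = first_sub_encode(block)       # e.g. an error block
        output.append(second_sub_encode(intermediate, params))
    return output


# Toy stand-ins (assumptions): each block is a (current, reference) value pair.
blocks = [(11, 10), (12, 12), (5, 9), (31, 30)]
result = encode_frame_single_loop(
    blocks,
    predicted_quantity=1.5,
    first_sub_encode=lambda b: b[0] - b[1],              # residual
    choose_parameters=lambda q: 2 if q > 1.0 else 1,      # quantization step
    second_sub_encode=lambda r, step: int(r / step),      # truncating quantizer
)
```

Because the parameters are chosen from the predicted quantity, no first sub-encoded block has to be stored while waiting for the remaining blocks of the frame.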

This embodiment of an encoding method in accordance with the present invention 
takes into account channel bandwidth limitations by adapting some encoding parameters
based on a calculated value of the bit rate needed, thereby taking into account a
characterisation of the video frame to be encoded. Said invented encoding method thus 
uses a relation, also denoted model, relating said bit rate, characteristics of the to be 
encoded video frame and said encoding parameters. More particularly, said invented 
encoding method uses a prediction of a characterization of the video frame to be encoded. 
Within said model an estimate or predicted value is used. 

By using a relation between bit rate, video frame characteristics and encoding
parameters, the invented encoding method provides adaptive encoding capabilities
with improved video quality preservation. Moreover, by explicitly using an estimate
or prediction of a characterization of the video frame to be encoded, said encoding method
can be implemented in an efficient way.

The invented encoding method constructs adaptive encoding methods from relations
between bit rate, encoding parameters and a characteristic of the video frame to be
encoded. The invented encoding method avoids the sole use of historic bit rate
data for taking decisions concerning encoding parameter adaptations, as is done in prior
art predictive control approaches (which is actually unreliable), and hence the invention
shows high quality preserving properties. Moreover, the use of a predicted characteristic of
the to-be-encoded current video frame makes it especially well suited from an
implementational point of view.

Encoding of the video information stream results in the generation of another
digital representation of said video information stream. Said other digital representation
is preferably more efficient for transmission and/or storage. Said encoding can be based
on the fact that temporally nearby video frames are often quite similar except for some
motion within the image. The arrays of pixels of temporally close video frames often
contain the same luminance and chrominance information except that the coordinate
places or pixel positions of said information in said arrays are shifted by some locations or
distance. Shifting in place as a function of time defines a motion. Said motion is
characterized by a motion vector. Encoding of the video information stream is done by
performing encoding of said video frames of said time sequence with respect to other
video frames of said time sequence. Said other video frames are denoted reference video
frames. Any video frame may be a reference frame. Said encoding is in principle based on
motion estimation of said motion between a video frame and a reference video frame.
Said motion estimation defines a motion vector. When the motion is estimated, a motion
compensation is performed. Said motion compensation comprises constructing a new
motion compensated video frame from the reference video frame by applying the found
motion. Said motion compensated video frame comprises the pixels of said reference
video frame but located at different coordinate places. Said motion compensated video
frame can then be subtracted from the video frame under consideration. This results in an
error video frame. Due to the temporal relation between the video frames said error video
frame will contain less information. This error video frame and the motion estimation
vectors are then transmitted, after performing some additional coding of the error video
frame. The encoding method described above has a first sub-encoding step comprising
motion estimation and motion compensation. The result of said first sub-encoding step,
denoted a first sub-encoded video frame, is actually the obtained error video frame. The
further coding of said error video frame can be denoted by a second sub-encoding step.
Recall that said motion estimation and compensation is not necessarily performed on a
complete video frame but on blocks of said video frame. The result of said first
sub-encoding step in a block oriented approach is thus more precisely an error block.

If a quantity of an error frame, being the result of a first sub-encoding of a video
frame with respect to a reference video frame, is to be determined before said first sub- 
encoding is completed, then the quantity has to be predicted or estimated. Predicting said 
quantity for a video frame, also denoted current video frame, which should be at least an 
estimate of the quantity obtained as a result of an actual first sub-encoding of said current 
video frame, can take into account how said quantity will change when compared with the
same quantity of the previous encoded video frame, now used as reference video frame. 
As said first sub-encoding is such that the differences between said current video frame 
and said reference video frame are extracted up to some motion between said current and 
said reference video frame, said prediction can be based on predicting how said 
differences will evolve. As said first sub-encoding is based on minimizing the differences 
between said current video frame and said reference video frame, and hence their 
quantities would in the ideal case be the same, said difference evolution prediction can in
fact try to take into account the failure of said first sub-encoding to minimize the differences.
Alternatively, said predictive determination of said quantity of said current video frame 
can be said to be based on the performance of said first sub-encoding, more precisely on 
an estimate or prediction of how said first sub-encoding would actually perform when 
applied to said current video frame. 

In an embodiment of the invention said predictive determination of said quantity 
of said current video frame, being based on a predicted performance of said first sub- 
encoding when applied to said current video frame, exploits the actual performance of 
said first sub-encoding when applied to a previous video frame, used as reference video 
frame. The previous frame can be the next previous frame or another previous frame. 

A particular embodiment of the present invention comprises an exploitation of the
actual performance of said first sub-encoding when applied to the previous video frame,
with a first sub-encoding being based on motion estimation and motion compensation. In
this embodiment, after having performed said first sub-encoding on a previous video
frame, for essentially all the blocks into which the previous video frame is divided or
partitioned, conclusions can be drawn with respect to said first sub-encoding performance
for each block separately. When said first sub-encoding performance of a block is below a
certain level, it can be decided to encode that block itself instead of the related error block.
Such a block is then labeled as an intra-coded block. For the related block in the current,
still to be encoded video frame, it can be assumed in this context that said first
sub-encoding will have the same performance, hence the quantity determined for the block in
said previous frame can be taken as a good estimate of the quantity of the block in said
current frame. When said first sub-encoding based on motion estimation has found, for a
block in said previous frame, a zero motion vector with an acceptable sub-encoding
performance, the block can be assumed to be in a still image part of the video frame
sequence. It can be assumed in this context (at least as an approximation) that said first
sub-encoding will find the same zero (or near-zero) motion vector for the related block in
the current video frame, and hence the quantity of the related block in both frames will
remain the same (or approximately the same). For a block of the previous video frame which
is not intra-coded and has a non-zero motion vector, first sub-encoding failure is due to a
mismatch between the actual changes between the current video frame and the reference
video frame and the translational model for which the first sub-encoding tries to
compensate. The contribution of such non-intra-coded moving blocks to the current video
frame quantity, being caused by movement error, is by its dynamic nature time dependent,
more particularly dependent on, and more precisely proportional to, the time elapsed
between the current video frame and the reference video frame.

Note that although the above described similarity despite some motion of video
frames appears only in ideal cases, it forms the basis of encoding based on a translational
motion model. The transformation between a video frame and a temporally close video
frame can also be a more complicated transformation. Such a complicated transformation
can form the basis of a more complicated encoding method. Within such a more general
approach the method may include a step of determining the parameters of the assumed
applied transformation by estimation, performing an associated compensation for said
transformation by applying said transformation on the reference frame and determining
an error video frame. Said first sub-encoding method then comprises transformation
parameter estimation, performing transformation compensations and determining an
error video frame. Such a complicated or more global transformation can comprise various
digital processing steps, such as translation, rotation, scaling, zoom and/or any
combination thereof.

Alternatively formulated, said adaptive encoding method comprises a step of 
identifying within said reference video frame a first and a second region, determining a
first region quantity for said first region, a second region quantity for said second region 
and computing said quantity of said current video frame from the first region quantity, 
said second region quantity and the time interval between said reference video frame and 
said current video frame. Said first region is related to intra-coded parts and substantially 
non-moving parts of said reference video frame and said second region is related to 
moving parts of said reference video frame. The final reference quantity is based on said 
second region quantity multiplied by said time interval. 

In the above described encoding methods the use of reference video frames is
presented but the invention is not limited thereto. The invented method, wherein said
adaptation is based not directly on said intermediate result but instead on predictions,
can be applied as long as a first sub-encoding method of any kind and a second
sub-encoding method of any kind can be distinguished, the parameters of said second
sub-encoding method need to be adaptable for resource constraints in general and channel
bandwidth limitations in particular, and said adaptation in principle depends on the
intermediate result obtained after said first sub-encoding. Said prediction of a quantity,
to be used in second sub-encoding a video frame, is based on the performance of said first
sub-encoding when applied to a previous video frame.

It is not expected that there is any limit on the application of the described 
encoding methods in accordance with the present invention. For example, they can be 
applied to all types of framed data streams, including video, audio, 3-D multi-media data
or combinations thereof. Said framed data encoding method uses prediction of a quantity,
to be used in second sub-encoding of a frame, said prediction being based on the
performance of a first sub-encoding step applied to a previous frame.

Said second sub-encoding methods can be based on wavelet transforms, matching
pursuits, tree coding such as quadtree or binary tree coding, or DCT (Discrete Cosine
Transform) or similar. In the wavelet transform approach the number of levels, the
quantization step or number of bitplanes, and the indexes of coded coefficients can be
adaptable parameters. Within matching pursuits approaches the dictionary, the amount of
atoms selected around a macroblock, and quantization parameters can be adaptable
parameters. Within a tree coding approach thresholds and other parameters guiding the
building of the tree representation can be adaptable parameters. When using meshes in
2-D or 3-D representations, the number of vertices and nodes to be used in said meshes can
also be adaptable parameters.

A block-oriented adaptive encoding method in accordance with an embodiment
of the present invention can be formalized as follows:

It is a method of adaptive encoding at least a part of a frame of a stream of framed
data comprising the steps of:
dividing said part of said frame into blocks;
performing a first sub-encoding step on a block; thereafter
performing a second sub-encoding step on said first sub-encoded block, said second
sub-encoding step adapting its encoding parameters based on a quantity of said first
sub-encoded part of said frame being determined by prediction from a reference frame; and
subsequently performing said steps on another block of said part of said frame.
The frame may be a video frame.

Note that in the above described method only the determining of said quantity is 
related to the reference video frame. 

The present invention also provides an adaptive encoding device being capable of 
taking into account resource limitations such as channel bandwidth limitations by
adapting the encoding parameters of said second sub-encoding step based on said quantity.
The step of transmitting said second sub-encoded blocks over the channel can include 
taking into account channel bandwidth limitations dynamically, e.g. in the selection of 
encoding parameters.

Said first sub-encoding step can comprise, but is not limited to, performing motion
estimation of a block with respect to said reference video frame and thereafter performing
motion compensation of said block; and thereafter determining the error block, or more in
general performing transformation parameter estimation of a block with respect to said
reference video frame; thereafter performing a transformation compensation step on said
block; and thereafter determining the error block. The motion estimation/compensation
approach and the generalised method described here emphasize that said first
sub-encoding is related to a reference video frame but the invention is not limited thereto.

Said second sub-encoding can be selected from the group comprising wavelet
encoding, quadtree or binary tree coding, DCT coding and matching pursuits or similar.

In the above a reference video frame has been introduced as the reference video
frame from which said quantity, whereon adaptation is based, will be computed. In some
embodiments of the present invention said reference video frame is also the reference
frame for the first sub-encoding step. It is clear that the method is applicable to encoding a
sequence of video frames. The method then comprises the following steps:

determining for at least one current frame, selected from said sequence of frames,
an encoding parameter based on a quantity of said current frame, this determination being
performed by prediction from a reference frame which is also selected from said sequence
of frames, and thereafter encoding said current frame taking into account at least said
encoding parameter. The frame may be a video frame. Generally, during the course of
encoding said sequence other frames will be selected as current and as reference video
frame.

A method in accordance with the present invention can be generalized as:
determining an encoding parameter based on a quantity of a current frame, the encoding
parameter being determined by prediction from a plurality of reference frames, also
selected from said sequence of frames. The frame may be a video frame. In some
embodiments said encoding takes into account at least one encoding parameter being
determined directly from at least one of said reference frames. In another embodiment
said encoding exploits an average of said predicted encoding parameter of said current
frame and said encoding parameter of one of said reference frames. In another
embodiment the following is selectably used: either said encoding being based on an
encoding parameter based on said quantity being predicted or said encoding being based
on a combination of said encoding parameter based on said quantity being predicted and
said encoding parameter being determined directly from at least one of said reference
frames. Said selection can be based on detection of oscillations in the generated sequence
of encoding parameters.

The nature of said quantity, determined by prediction, can be described as a
measure of the information content or energy or complexity within said current video
frame. In a more particular embodiment said measure is derived from the sum of
absolute differences between the motion compensated current video frame and the
previous video frame. For example, said measure is derived from the error norm between
the motion compensated current video frame and the previous video frame.
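
As an illustration of such a measure, the following Python sketch computes a sum of absolute differences and the corresponding mean absolute difference between two pixel arrays. Representing a frame as a flat list of luminance values, and the function names `sum_abs_diff` and `mean_abs_diff`, are assumptions made for this example only.

```python
def sum_abs_diff(pixels_a, pixels_b):
    """Sum of absolute differences (SAD) between two equally sized pixel lists."""
    if len(pixels_a) != len(pixels_b):
        raise ValueError("pixel arrays must have the same size")
    return sum(abs(a - b) for a, b in zip(pixels_a, pixels_b))


def mean_abs_diff(pixels_a, pixels_b):
    """Mean absolute difference (MAD): the SAD divided by the number of pixels."""
    return sum_abs_diff(pixels_a, pixels_b) / len(pixels_a)


# Toy luminance values (assumptions) for a frame and its motion compensated counterpart.
frame = [10, 20, 30, 40]
compensated = [12, 20, 27, 40]
sad = sum_abs_diff(frame, compensated)
mad = mean_abs_diff(frame, compensated)
```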

The varying content of the sequence of video frames and the bandwidth limitations
of the transportation channel can result either in quantization adaptations in said second
encoding step or in completely skipping or discarding video frames, meaning not
transmitting them and thus not performing said second encoding step. Discarding a video
frame is a so-called hard decision and can be understood as a kind of adaptation method
in accordance with the present invention. The skip or discard method can exploit the
result of said first sub-encoding method for deciding whether the video frame will be
skipped or not. In order to avoid an unnecessary first sub-encoding of a video frame that
will not be further encoded and transmitted, the result of said first sub-encoding method is
in an embodiment of the invention not used directly but a quantity of said result is
predicted or estimated. This embodiment of the invention thus presents a method for
encoding a sequence of frames of framed data comprising the step of determining for at
least one current frame of said sequence of frames whether said current frame will be
selected for encoding before encoding said current frame. Said selection is based on a
prediction of a quantity of said current frame from reference frames. The frames may be
video frames.

Note that said soft adaptation methods for encoding a sequence of frames of
framed data can also be described as comprising the step of determining for at least one
current frame of said sequence of frames which encoding parameters will be used for
encoding said current frame before encoding said current frame with said encoding
parameters. Said determining of encoding parameters is based on a prediction of a
quantity of said current frame from reference frames. The frames may be video frames.

Recall the second aspect of the invention showing a method for encoding a video
frame with respect to a reference video frame, said method having adaptation
capabilities in case of channel bandwidth limitations. Said method, having a first and a
second sub-encoding step, relies on a quantity of the video frame to be encoded, more
precisely of the first sub-encoded version of said video frame to be encoded, said quantity
being predicted from a reference video frame, at least being first sub-encoded before said
video frame. Said quantity is used for adapting encoding parameters of said second
sub-encoding step. Said quantity computation is based on the labeling of blocks of said
reference video frame, said labeling being based on the performance of said first
sub-encoding step applied to said reference video frame.

Within said adaptive encoding method, a step of partitioning or dividing a 
reference frame into blocks, a step of labeling said blocks in accordance with the 
performance of the first sub-encoding applied to said reference frame, a step of computing 
a quantity based on said labeling of said blocks and performing a first and second sub- 
encoding step on the to be encoded frame can be distinguished. The frames can be video 
frames. Said second sub-encoding step adapts its encoding parameters based on said 
quantity. In an embodiment said first and second encoding steps applied to said to be 
encoded frame are performed per block of said frame. 

In the following, the current frame is the frame whose quantization parameter is 
being determined by the current operation of the bit rate control system. It is important to 
indicate which information is available to the control system and how the information can 
be transformed into estimations of coding results by means of a stochastic model. As 
explained above, the rate-distortion based methods proposed in the literature rely on a 
pre-analysis, more in particular a model exploiting parameters determined by pre-analysis 
of the to be encoded video frame. Motion compensation is applied on the entire frame 
before the control parameters are fixed in prior-art methods. This is because a measure of
the current DFD (Displaced Frame Difference) complexity has to be provided to the control
system. Most often, the measure of complexity is the mean absolute difference (MAD), i.e.
the average of the absolute DFD pixel values. In the present invention the same R-D based
control schemes can be used, although the invention is not limited thereto, but without
pre-analysis. That means that, in accordance with an embodiment of the present invention,
the parameter(s) that was (were) previously fixed by the pre-analysis is (are) estimated,
i.e. predicted. Now an embodiment disclosing a way to predict the MAD of the current frame,
only using information from the past, is presented.

The mean absolute difference (MAD) of the current frame is predicted from
information collected about the performance of the motion compensation of a previous
frame. Three types of MB's are defined in the previous frame:

(I) INTRA MB's. They are coded on their own, without reference to past or future frames.
They appear in parts of the video scene where correlation is poor between successive
frames. For the current frame, INTRA MB's are expected to be localized in the same parts
of the scene as in the previous frame. It is thus reasonable to assume that the INTRA MB's
contribution to the sum of absolute differences (SAD) is more or less constant for
successive frames.

(II) INTER MB's that are predicted with zero motion vectors (MV). They are often localized
in still areas. For these, the prediction error is mainly due to quantization errors in the
reference frame. It does not depend on the block translation motion model. It can be
assumed that in these areas of the scene, the prediction error does not depend on the
temporal distance between the reference and the predicted frame. So, for successive
frames, the contribution to the SAD for these blocks is assumed to be constant.

(III) INTER MB's that are predicted by a non-zero displacement. For these MB's, localized
in moving areas, prediction errors mainly occur when the translational motion model does
not fit the actual motion. Let us assume that the components of the motion that do not fit
the translational model at time t-1 remain constant in the near future. Then, the
displacement of the objects due to these components is proportional to the temporal
distance between two samples of the scene, i.e. the time elapsed between two frames.
Assuming that the area covered by the prediction error is proportional to this
non-translational displacement, the contribution to the SAD is roughly estimated in
proportion to the time passed between two frames, i.e. to the number of skipped frames if
the interval between two frames is constant.

The prediction of the current MADt results from the above definitions and considerations.

The above can be summarized by using the following formula:

where:
t (t-1) refers to the current (previous) frame,
N is the number of pixels of the frame,
Skipt (Skipt-1) is the number of frame(s) skipped before encoding the current
(previous) frame,
SAD refers to the sum of the absolute differences on a MB. Two classes of MB's
are considered: the ones for which the compensation error magnitude is likely to be
constant for successive frames (Intra-mode or Inter-mode without motion), and the ones
for which the compensation error increases with the skip factor (Inter-mode with motion).

It should be emphasized that the above formula is only one example. The invention 
encompasses within its scope all types of formulas which are inspired by the above 
considerations. 
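
By way of a hedged illustration only, the following Python sketch implements one formula of this kind. It assumes that the time elapsed before a frame is proportional to one plus the number of skipped frames, so that the SAD of the moving MB's of the previous frame is scaled by (1 + Skipt) / (1 + Skipt-1). That scaling factor, the dictionary layout of a macroblock and the name `predict_mad` are assumptions of this sketch and are not asserted to be the exact formula of the embodiment.

```python
def predict_mad(macroblocks, n_pixels, skip_t, skip_prev):
    """Predict the MAD of the current frame from the MB's of the previous frame.

    Each macroblock is a dict with keys "intra" (bool), "mv" (motion vector
    tuple) and "sad" (sum of absolute differences measured at time t-1).
    """
    constant_sad = 0.0   # INTRA MB's and INTER MB's with a zero motion vector
    moving_sad = 0.0     # INTER MB's with a non-zero motion vector
    for mb in macroblocks:
        if mb["intra"] or mb["mv"] == (0, 0):
            constant_sad += mb["sad"]
        else:
            moving_sad += mb["sad"]
    # Assumed time scaling: elapsed time is proportional to 1 + skipped frames.
    scale = (1 + skip_t) / (1 + skip_prev)
    return (constant_sad + scale * moving_sad) / n_pixels


# Toy data (assumptions): three 16x16 macroblocks, hence 768 pixels.
mbs = [
    {"intra": True, "mv": (0, 0), "sad": 400},
    {"intra": False, "mv": (0, 0), "sad": 100},
    {"intra": False, "mv": (2, -1), "sad": 300},
]
mad_no_skip = predict_mad(mbs, 768, skip_t=0, skip_prev=0)
mad_one_skip = predict_mad(mbs, 768, skip_t=1, skip_prev=0)
```

In this toy case only the contribution of the third macroblock grows when one frame is skipped, which reflects the two classes of MB's considered above.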

Due to the delay introduced by the prediction, two paths appear in the
dependency graph that links the QP values and the MAD values along the sequence (see
Figure 1). These two paths can be synchronized by using the average of the previous
(available) and current (predicted) MAD instead of the current (predicted) MAD. Actually,
the average replaces the predicted value only when an oscillation appears, i.e. when
$|QP_{t-1} - QP_{t-2}| > |QP_{t-1} - QP_{t-3}|$. Note that MADt depends on QPt-1 because
QPt-1 impacts the quality of the reconstructed frame at t-1 and so the prediction error at
time t. When a pre-analysis approach is used to fix the QP, QPt depends on MADt, while in
the invented approach QPt depends essentially on MADt-1, since the complexity of the
current frame is predicted from the previous frame.
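
The selection between the predicted MAD and the averaged MAD can be sketched in Python as follows. The oscillation test is taken here as |QPt-1 - QPt-2| > |QPt-1 - QPt-3|, which is an assumed reading of the condition; the function name `complexity_for_control` and the list-based QP history are likewise assumptions of this sketch.

```python
def complexity_for_control(qp_history, mad_previous, mad_predicted):
    """Return the complexity value handed to the rate control for frame t.

    qp_history lists past QP values from oldest to newest and ends with the
    QP used at t-1. The averaged MAD is used only when an oscillation shows up.
    """
    if len(qp_history) >= 3:
        qp_1, qp_2, qp_3 = qp_history[-1], qp_history[-2], qp_history[-3]
        # Oscillation: QP at t-1 is closer to QP at t-3 than to QP at t-2.
        if abs(qp_1 - qp_2) > abs(qp_1 - qp_3):
            return (mad_previous + mad_predicted) / 2
    return mad_predicted


oscillating = complexity_for_control([10, 16, 11], mad_previous=4.0, mad_predicted=6.0)
steady = complexity_for_control([10, 11, 12], mad_previous=4.0, mad_predicted=6.0)
```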

The behavior of an R-D based rate control can be compared when an actual (as in 
the prior-art) or a predicted (as in the invention) frame complexity measure is used in 
order to validate the proposed prediction scheme. It illustrates one of the possible 
applications of the scheme. Nevertheless, note that the ability to predict the complexity of 
upcoming frames enables control of other parameters than the quantization scale. For 
example, in contrast to MPEG-2 broadcast applications that use a group of pictures
structure for which the frame interval is fixed, H.263 and MPEG-4 do allow variable frame
skip. It is up to the rate control algorithm to decide on both spatial (quantization step) and
temporal (frame rate) coding parameters. Skipping frames allows preserving enough bits
for every coded picture and can ensure an almost constant picture quality along the
sequence. For instance one can self-adjust the frame rate according to both the current
picture content and the buffer status. This approach clearly outperforms methods that
skip frames based only on buffer fullness. Nevertheless, in prior-art methods this again
requires pre-analysis of all the frames that are likely to be coded. Being able to predict, as
in the present invention, the complexity of a future P frame as a function of the skip factor
strongly simplifies the algorithm. Indeed, combined with a rate distortion model of the
coder, it permits the fixing of the minimal skip factor providing the required quality.
Therefore, the present invention is not limited to adapting the quantization parameter but
includes all types of adaptations of a second encoding step that can be partly influenced by
predicted quantities.
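
One possible way of fixing such a minimal skip factor is sketched below in Python. The sketch rests on several assumptions that are not taken from the embodiment: the bit budget of a coded frame is assumed to grow linearly with the number of frame intervals it covers, the complexity is assumed to scale with (1 + skip), the rate model is assumed to be quadratic of the form R = alpha*C/Q + beta*C/Q^2, and the required quality is expressed as a maximum quantizer `q_max`. All names are hypothetical.

```python
import math


def solve_quantizer(target_bits, complexity, alpha, beta):
    """Positive root Q of target_bits = alpha*C/Q + beta*C/Q**2."""
    # Multiplying by Q**2 gives target_bits*Q**2 - alpha*C*Q - beta*C = 0.
    b = alpha * complexity
    disc = b * b + 4.0 * target_bits * beta * complexity
    return (b + math.sqrt(disc)) / (2.0 * target_bits)


def minimal_skip(constant_sad, moving_sad, n_pixels, skip_prev,
                 bits_per_interval, alpha, beta, q_max, max_skip):
    """Smallest skip factor whose predicted quantizer does not exceed q_max."""
    for skip in range(max_skip + 1):
        scale = (1 + skip) / (1 + skip_prev)            # assumed time scaling
        mad = (constant_sad + scale * moving_sad) / n_pixels
        budget = bits_per_interval * (1 + skip)         # assumed bit budget
        q = solve_quantizer(budget, mad, alpha, beta)
        if q <= q_max:
            return skip, q
    return None  # no skip factor up to max_skip reaches the required quality


# Toy numbers (assumptions) chosen so that two frames have to be skipped.
choice = minimal_skip(1000, 1000, 1000, 0, 1000, 10000, 20000, 16, 5)
```

No candidate frame needs to be motion compensated in this loop, since only predicted complexities are evaluated.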

Such a method can be formalized as follows: 
A method of block-oriented adaptive encoding at least a part of a frame of a sequence of
framed data with respect to a reference frame of the sequence comprising the steps of:
dividing said reference frame into blocks;
performing a first sub-encoding step on said reference frame with respect to a previous
reference frame;
labeling said blocks of said reference frame based on the performance of said first
sub-encoding step and the motion vector of said blocks;
determining for each of said blocks of said reference frame a measure of a difference
between related blocks in said previous reference frame;
computing a quantity from said measures of differences for said blocks and exploiting the
labeling of said blocks;
performing said first sub-encoding step on a block of said frame; thereafter
performing a second sub-encoding step on a block of said first sub-encoded frame, said
second sub-encoding step adapting its encoding parameters based on said quantity.
The frames may be video frames.

The method recited above may further include that said computing of said
quantity takes into account the time elapsed between said current frame and said
reference frame.

The method recited above may further include that said blocks of said reference
frame have a first label when said blocks are intra-coded or when said blocks have a
substantially zero motion vector, said blocks of said reference frame have a second label
otherwise, said computed quantity being the sum of:
the sum of all measures of differences of blocks with a first label;
a normalized sum of all measures of differences of blocks with a second label multiplied
with the time elapsed between said current frame and said reference frame.
The frames may be video frames.

In the following, a conventional R-D based rate control algorithm is considered. 
Either a computed or a predicted complexity measure is used as model parameter. The 
performances are compared in both cases. The relevance of the proposed prediction 
scheme is deduced. 

The performance of the invented encoding method is now illustrated by analysing 
an embodiment thereof. The quantization parameter (QP) for an entire and single video 
object is selected to be the to-be-adapted parameter of said second encoding step. 
Nevertheless, in accordance with the present invention, the approach can be extended to 
multiple video objects or to macro-block level QP selection. The rate control model is the 
one proposed by the MPEG-4 scalable control scheme (SRC) but the invention is not 
limited thereto. It is scalable for various bit rates, and spatial or temporal resolutions. The 
SRC assumes that the encoder rate distortion function can be modeled as: 


$$R = \frac{\alpha C}{Q} + \frac{\beta C}{Q^{2}} \qquad (2)$$

R is the targeted encoding bit count. C denotes the complexity of the encoded frame, i.e. 
the mean absolute difference (MAD) of the prediction error for P frames. The quantization 
parameter is denoted as Q. The modeling parameters are denoted as α and β. They are 
defined based on the statistics of past encoded frames. Because of the generality of the 
assumption, the SRC is largely applicable. Note also that, for all the results provided 
below, the buffer size corresponds to half a second and frames are skipped when the buffer 
level reaches 80% of the buffer size. 
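
As a worked illustration of Equation (2), the quantization parameter can be obtained by solving the model as a quadratic in Q for a given target bit count. The Python sketch below is only an example under stated assumptions: the function names `solve_qp` and `should_skip`, the clamping range 1 to 31 and the handling of degenerate inputs are illustrative choices and are not taken from the SRC; only the model R = αC/Q + βC/Q² and the 80% skipping threshold come from the text above.

```python
import math


def solve_qp(target_bits, complexity, alpha, beta, qp_min=1, qp_max=31):
    """Solve R = alpha*C/Q + beta*C/Q**2 for Q and clamp the result.

    The clamping range [qp_min, qp_max] is an assumption of this sketch.
    """
    if target_bits <= 0 or complexity <= 0:
        # No bit budget or no measurable complexity: use the coarsest quantizer.
        return qp_max
    a = alpha * complexity
    b = beta * complexity
    if b == 0:
        # The model degenerates to the first-order form R = alpha*C/Q.
        q = a / target_bits
    else:
        # Rearranged model: R*Q**2 - a*Q - b = 0; keep the larger root.
        disc = a * a + 4.0 * target_bits * b
        if disc < 0:
            return qp_max
        q = (a + math.sqrt(disc)) / (2.0 * target_bits)
    return max(qp_min, min(qp_max, round(q)))


def should_skip(buffer_level, buffer_size, threshold=0.8):
    """Return True when the buffer level reaches the skipping threshold."""
    return buffer_level >= threshold * buffer_size
```

For instance, with α = 1000, β = 2000 and C = 8, a target of 960 bits corresponds to Q = 10, since 8000/10 + 16000/100 = 960.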

Now, for one sequence and one target bit rate, the reference rate control algorithm 
(using pre-analysis) is compared with the invented predictive scheme, i.e. a scheme in which the 
pre-analysis (current frame MAD computation) is replaced by a prediction step. 

In Figure 2, the previous MAD is used as an approximation of the current MAD, so 
there is no intention to relate said quantity to the complexity of the current video frame 
nor to use time difference between said reference and said current frame. This very simple 
prediction method is unfortunately not accurate enough. It severely degrades the system 
performances. Lots of frames are skipped (drops in the PSNR graph). 

In Figure 3, the behavior of the motion compensation on the previous frame is 
taken into account according to Equation (1). Said equation predicts MADt, thereby 
implementing the approach of the invention. It improves the control so that fewer frames 
are skipped. 

Due to the existence of two dependency paths, oscillations can appear along the 
sequence of selected QP and engendered MAD. In Figure 4, the current (predicted) MAD 
and the previous (computed) MAD are averaged when oscillations appear, which is a 
further embodiment of the present invention. The rate control performances are similar in 
both cases. 
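
The averaging step could be sketched as follows. The text does not specify how oscillations are detected, so the criterion used here (a sign change between the two most recent MAD differences) is purely a hypothetical choice, as are the function names.

```python
def is_oscillating(values):
    """Assumed criterion: the last two successive differences change sign."""
    if len(values) < 3:
        return False
    d_prev = values[-2] - values[-3]
    d_last = values[-1] - values[-2]
    return d_prev * d_last < 0


def smoothed_mad(predicted_mad, computed_history):
    """Average the predicted and previous MAD when oscillations appear.

    computed_history holds the computed MAD of past frames, oldest first.
    """
    if not computed_history:
        return predicted_mad
    previous_mad = computed_history[-1]
    if is_oscillating(list(computed_history) + [predicted_mad]):
        return 0.5 * (predicted_mad + previous_mad)
    return predicted_mad
```

With a history of 10 and 14 and a prediction of 9, the sequence rises and then falls, so the sketch returns the average 11.5; with a history of 10 and 12 and a prediction of 14, the prediction is used unchanged.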

Additional comparisons are provided in Figures 5 to 7. In Figure 5, an Akiyo 
sequence is coded at 50 kbits/s. Solid lines refer to the reference conventional scheme, i.e. 
the rate control with pre-analysis. Dotted lines refer to the predicted MAD. At the top, the 
very simple prediction is used, i.e. the current MAD is taken to be about the same as the 
previous MAD. Again, 
it causes a poor control of the buffer level, which results in lots of skipped frames (large 
drops in the PSNR graph). At the bottom of Figure 5, the invented scheme shows very 
similar performance to the reference one. Buffer level is controlled with similar 
performance in both schemes, so that the same number of frames is skipped for both 
schemes. In Figures 6 and 7, two sets of parameters are considered for the Foreman QCIF 
sequence. In Figure 6, the first hundred frames of the sequence are encoded at 50 kbits/s. 
In Figure 7, 300 frames are encoded at 200 kbits/s. At the top of these figures, buffer level 
and PSNR curves are provided both for the reference scheme (solid lines) and the simple 
prediction scheme (dotted lines). At the bottom, the reference scheme is compared with 
the proposed invented one (Equation (1) and averaging when oscillations appear). Again, 
one can conclude that the proposed predictive scheme enables a better control of the 
buffer level than the simple one and achieves performance that is similar to the 
reference one. 

Note that more complex models adapt the QP on a macro-block basis. In addition 
to a measure of the average energy of the frame to encode, the R-D model is also 
parameterized by the energy of the current macro-block. This parameter is available in a 
one pass encoding process. So, being able to predict the average energy of the frame is 
enough to apply such MB-based R-D model to spatially localized encoding schemes. 
Nevertheless, it is worth noting that one could go further than just generalizing existing 
schemes. The prediction introduces latency in the control of the quantization parameter 
(QP). For example, a scene cut can only be detected once the considered frame has been 
coded. The ability to change the QP on a MB basis increases the freedom and flexibility of 
the rate control. It allows adapting the QP to the local complexity of the scene and 
achieving the target bit count more accurately. It should also enable a faster reaction to 
sharp scene changes. The scene change could be taken into account on the fly, while 
sequentially dealing with the blocks of the frame. 

Note also that in this illustration only so-called P frames are involved. Naturally, the 
coding parameters of B frames can be treated in the same way. However, since B frames 
usually need fewer bits to code, extra efforts for the modeling are not always justified. A 
simple weighted average of quantization scales of its two anchor frames can be used 
alternatively. 
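
A minimal sketch of such a weighted average is given below. The weighting by temporal distance to each anchor frame is an assumption made for illustration; the text only states that a simple weighted average can be used. The function name and its parameters are hypothetical.

```python
def b_frame_qp(qp_prev_anchor, qp_next_anchor, dist_prev=1, dist_next=1):
    """Weighted average of the quantization scales of the two anchor frames.

    Assumption: the nearer anchor receives the larger weight.
    """
    total = dist_prev + dist_next
    if total <= 0:
        raise ValueError("anchor distances must be positive")
    w_prev = dist_next / total
    w_next = dist_prev / total
    return round(w_prev * qp_prev_anchor + w_next * qp_next_anchor)
```

For a B frame midway between anchors quantized with 10 and 20, the sketch yields 15.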

Figure 8B shows an embodiment of an adaptive encoding circuit in accordance 
with the present invention. Figure 8B shows a schematic representation of an encoding 
apparatus with first sub-encoding circuit (10) and second sub-encoding circuit (20) for 
encoding a video frame (320) with respect to a reference video frame (310) of a sequence of 
frames on a time axis (300). The current video frame (320) in its encoded form is to be 
transmitted via a bandwidth limited channel (60), preferably being preceded by a 
buffering circuit (30). Optionally, a video frame discarding circuit is present in between 
said first and second sub-encoding circuits (10) and (20) but it is preferred if a skipping or 
discarding circuit (70) is located before said first sub-encoding circuit (10). Said first and 
second sub-encoding steps are executed in a block-based way. Reference number (400) 
represents the block loop, meaning that each block is processed before the next block. 
Hence, first and second sub-encoding circuits (10, 20) preferably execute in a similar 
fashion. The decision on how to adapt the bit rate is made by a decision circuit (40). This 
adaptation decision optionally takes into account the output of buffer information (100) 
from the buffer circuit (30) and/or information relating to, and obtained by analysis of, 
the complexity of the first sub-encoded video frame (140) from the first sub-encoding 
circuit (10). The adaptation may be performed either by adapting parameters (120) of said 
second sub-encoding circuit (20) or by discarding said current video frame, e.g. in a 
skipping or discarding circuit (70). Said first sub-encoding circuit (10) optionally 
comprises circuits or means for transformation (motion) estimation and transformation 
(motion) compensation (11) and (12). Note that discarding based on buffer fullness 
information from said buffer circuit (30) only before first sub-encoding circuit (10) is also 
included within the present invention. 

In accordance with an embodiment of the present invention the decision circuit 
(40) does not take into account in a direct way information of said whole first sub-encoded 
video frame (as this would be obtained too late in the process to be really useful). Instead 
it comprises means or a circuit to compute a quantity, which is assumed to be related to 
said whole first sub-encoded video frame, from a reference video frame (310) as indicated 
by line (500). This quantity, its definition, calculation and uses have been discussed in 
detail above. Note that the computation within said decision circuit (40) can use 
information from a skipping means. More in particular the skipping means can determine 
the time distance between the current video frame and a previously encoded reference 
video frame. Also note that line (500) only indicates which information is used for quantity 
computation, not where the information is actually stored. Indeed, in an embodiment of 
the present invention said reference video frame has already been first sub-encoded before 
said current video frame and it is the information obtained therefrom which is used for 
partitioning, labeling and actual computing. As said decision circuit (40) does not need 
said first sub-encoded video frame (140) as a whole, both said first and second sub- 
encoding circuits (10, 20) can be merged within one loop (400), meaning that first and 
second sub-encoding of a block is performed before another block is processed. Moreover 
discarding in the discarding or skipping circuit (70) before first sub-encoding can now 
(optionally: also) be based on estimated information (190) about the current video 
frame. Note that the computation within said decision circuit (40) can still use information 
(180) from the skipping or discarding circuit (70), more in particular the time distance 
between the video frame and the previously encoded reference video frame. Hence, the 
skipping or discarding circuit (70) may comprise means for determining this time 
difference. 
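
The merged block loop (400) can be pictured with the following Python sketch. It is a schematic illustration only: the function `encode_frame` and the three callables passed to it are hypothetical placeholders for the first sub-encoding circuit (10), the decision circuit (40) and the second sub-encoding circuit (20). The point shown is that the quantity is predicted before the loop starts, so that both sub-encoding steps can be applied to one block before the next block is touched.

```python
def encode_frame(blocks, predicted_quantity, first_sub_encode,
                 choose_params, second_sub_encode):
    """Run first and second sub-encoding block by block in a single loop.

    predicted_quantity is assumed to have been derived from the reference
    frame, so no pass over the whole current frame is needed beforehand.
    """
    # Parameters depend only on the predicted quantity in this sketch.
    params = choose_params(predicted_quantity)
    encoded = []
    for block in blocks:
        error_block = first_sub_encode(block)        # e.g. motion compensation
        encoded.append(second_sub_encode(error_block, params))  # e.g. DCT + quantization
    return encoded
```

A toy invocation with integer "blocks" would be `encode_frame([4, 8], 2.0, lambda b: b - 1, lambda q: int(q), lambda e, p: e // p)`, where the lambdas merely stand in for real encoding stages.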

The adaptive encoding circuit described above may take into account channel 
bandwidth limitations in the adaptive process, e.g. by adapting the second sub-encoding 
parameters. This adaption may be included by altering the quantity used by the circuit for 
adaptation of the parameters in accordance with channel properties or limitations. 
The first sub-encoding circuit (10) may perform transformation parameter estimation of a 
block with respect to said reference frame followed by performing a transformation 
compensation step on said block and thereafter determining the error block. 

The second sub-encoding circuit (20) may perform an encoding selected from the 
group comprising of wavelet encoding, quadtree or binary tree coding, DCT coding and 
matching pursuits coding or similar. 

The first sub-encoding circuit (10) may include a division circuit for dividing the 
current frame or a reference frame into blocks. This division circuit may also be an 
independent circuit. This division circuit (independent of its implementation) may label 
the blocks of said reference frame in accordance with the performance of the first sub- 
encoding step applied to the reference frame. The decision circuit can then compute the 
quantity based on the labeling of said blocks which is then used for adapting the encoding 
parameters of the second sub-encoding circuit (20). The computing of the quantity may also 
take into account the time elapsed between the current frame and the reference frame as 
determined from the skipping circuit (70). 

The labeling by the division circuit can be carried out in the following way: the 
blocks of the reference frame are given a first label when said blocks are intra-coded or 
when said blocks have a substantial zero motion vector and, otherwise, the blocks of said 
reference frame have a second label. The computed quantity can be formed by the sum of: 
the sum of all measures of prediction errors of blocks with a first label; a normalized sum 
of all measures of prediction errors of blocks with a second label multiplied with the time 
elapsed between said current frame and said reference frame. 
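
The labeling rule and the resulting quantity can be sketched in Python as follows. Several points are assumptions of this sketch and not statements of the text: the class `BlockStats` and the function names are hypothetical, a "substantial zero" motion vector is modeled by a component-wise threshold `mv_threshold`, and the normalization of the second sum is taken to be a division by a reference time interval `ref_interval`.

```python
from dataclasses import dataclass


@dataclass
class BlockStats:
    intra: bool            # block of the reference frame was intra-coded
    motion_vector: tuple   # (dx, dy) found by the first sub-encoding step
    error: float           # measure of the prediction error, e.g. a SAD


def has_first_label(block, mv_threshold=0):
    """First label: intra-coded or (substantially) zero motion vector."""
    dx, dy = block.motion_vector
    return block.intra or (abs(dx) <= mv_threshold and abs(dy) <= mv_threshold)


def predict_quantity(blocks, elapsed, ref_interval=1.0):
    """Sum for first-label blocks plus normalized, time-scaled second-label sum."""
    first_sum = sum(b.error for b in blocks if has_first_label(b))
    second_sum = sum(b.error for b in blocks if not has_first_label(b))
    # Assumed normalization: error per unit of time, scaled by the elapsed time.
    return first_sum + (second_sum / ref_interval) * elapsed
```

With one intra block (error 10), one zero-motion block (error 5) and one moving block (error 8), an elapsed time of 2 and a reference interval of 1, the sketch gives 10 + 5 + 8 × 2 = 31.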

A further embodiment of the present invention includes an adaptive encoding 
circuit in which at least a part of a current frame of a sequence of frames of framed data is 
processed with respect to a reference frame comprised in the sequence, further comprising 
a division circuit for dividing the reference frame into blocks and labeling the blocks of the 
reference frame in accordance with the performance of a first sub-encoding step applied to 
said reference frame by the first sub-encoding circuit. A decision circuit then computes the 
quantity based on the labeling of the blocks and decides, based on the quantity, to perform 
or skip encoding of the current frame. 

The above encoding circuit(s) may be implemented as a self-contained unit such as 
an accelerator card for a personal computer or may be implemented on a computing 
device such as a personal computer or server as is known to the skilled person by 
programming the method steps in software. Hence, the word "circuit" should be 
understood in the widest sense and includes implementation in either software or 
hardware. The computer or server may be programmed to carry out any of the method 
steps or any combination of steps described above (especially the combinations described 
in the attached claims and described in the Summary of the Invention, above) in 
accordance with the present invention. Alternatively, these same method steps or 
combination of method steps may be programmed into a dedicated processor as may be 
used on a board or card for insertion into a server, a computer or the node of a 
telecommunications network. The combination of dedicated and programmable elements 
in one or more of the circuits described above may be advantageous, e.g. the use of 
programmable digital elements such as programmable gate arrays, especially Field 
Programmable Gate Arrays, PALs, PLAs, etc. can provide factory or field 
programmability and allow updates to encoding algorithms without change of hardware. 

The present invention may also be used as part of a telecommunications system, 
that is, any system capable of transmitting and receiving signals, such as, but not limited 
thereto, a computer, a telephone system, a Local Area Network, a Wide Area Network, the 
Internet, a mobile telecommunications system, a cellular telephone system, a Metropolitan 
Access network, a satellite communication system, radio or television. 

CLAIMS 

1. A method of adaptive encoding at least a part of a current frame of a sequence of frames 
of framed data comprising the steps of: 

dividing said part of said current frame into blocks; 

performing a first sub-encoding step on a block; thereafter 

performing a second sub-encoding step on said first sub-encoded block, said second sub- 
encoding step adapting its encoding parameters on a quantity of said first sub-encoded 
part of said current frame being determined by prediction from a reference frame; and 
said steps are performed on another block of said part of said current frame. 

2. The method recited in claim 1 wherein said steps are performed on another block of said part 
of said current frame in said order. 

3. The method recited in claim 1 or 2, wherein the encoded frames are to be transmitted over 
a transmission channel and said adaptive encoding method takes into account channel 
bandwidth limitations by adapting said second sub-encoding step's encoding parameters 
based on said quantity. 

4. The method recited in any of claims 1 to 3, wherein said adaptive encoding of at least a 
part of said current frame is performed with respect to a reference frame, said first sub- 
encoding step comprising of: 

performing transformation parameter estimation of a block with respect to said reference 
frame; thereafter 

performing a transformation compensation step on said block; and thereafter 
determining the error block. 

5. The method recited in claim 1, 2 or 4, wherein said second sub-encoding is selected from the 
group comprising of wavelet encoding, quadtree or binary tree coding, DCT coding and 
matching pursuits coding. 


6. A method of adaptive encoding at least a part of a current frame of a sequence of frames 
of framed data, with respect to a reference frame comprised in said sequence, the method 
comprising the steps of: 

dividing said reference frame into blocks and labeling said blocks of said reference frame 
in accordance with the performance of a first sub-encoding step applied to said reference 
frame; 

computing a quantity based on the labeling of said blocks; 

performing said first sub-encoding step on said current frame; 
performing a second sub-encoding step on said first sub-encoded frame, said second sub- 
encoding step adapting its encoding parameters based on said quantity. 

7. The method recited in claim 6 wherein said computing of said quantity takes into 
account the time elapsed between said current frame and said reference frame. 

8. The method recited in claim 6 or 7, wherein said blocks of said reference frame have a 
first label when said blocks are intra-coded or when said blocks have a substantial zero 
motion vector, said blocks of said reference frame have a second label otherwise, said 
computed quantity being the sum of: 

the sum of all measures of prediction errors of blocks with a first label; 
a normalized sum of all measures of prediction errors of blocks with a second label 
multiplied with the time elapsed between said current frame and said reference frame. 

9. An apparatus for adaptive encoding of a part of a current frame of a sequence of frames 
of framed data, comprising: 

an encoder capable of performing first and second sub-encoding steps on a block of 
said current frame and for adapting encoding parameters of said second sub-encoding 
step based on a quantity related to the block of said current frame after it has been first 
sub-encoded; and 

a decision circuit capable of determining said quantity by prediction from a 
reference frame. 

10. An apparatus for adaptive encoding at least a part of a current frame of a sequence of 
frames of framed data, with respect to a reference frame comprised in said sequence, 
comprising: 

an encoder for first sub-encoding said reference frame; 
means for dividing said reference frame into blocks and labeling said blocks of said 
reference frame in accordance with the output of said encoding circuit; 
means for computing a quantity based on the labeling of said blocks; 
an encoder for performing said first sub-encoding step on said current frame; 
an encoder for performing a second sub-encoding on said first sub-encoded frame; and 
means for adapting the encoding parameters of the encoding circuit for said second sub- 
encoding based on said quantity. 

11. A method of implementing a two step encoding method, said two step encoding 
method comprising of a first sub-encoding step and a second sub-encoding step, said two 
step encoding method being applied to a current frame of a sequence of frames of framed 
data, said method comprising the steps of: 

performing a decision step, said decision step being based on a quantity of the current frame 
that would be obtained when applying said first sub-encoding step to said current frame, 
said quantity being determined by prediction from a reference frame, said decision step 
deciding whether said two step encoding method will be applied to said current frame or 
not. 

12. The method recited in claim 11, wherein the encoded frames are to be transmitted over 
a channel and said method is capable of taking into account channel bandwidth 
limitations. 

13. A method of adaptive encoding at least a part of a current frame of a sequence of 
frames of framed data, with respect to a reference frame comprised in the sequence, the 
method comprising the steps of: 

dividing said reference frame into blocks and labeling said blocks of said reference frame 
in accordance with the performance of a first sub-encoding step applied to said reference 
frame; 

computing a quantity based on the labeling of said blocks; 

deciding based on said quantity to perform or skip encoding said current frame; 

in case said encoding is performed, performing said first sub-encoding step on said 
current frame and a second sub-encoding step on said first sub-encoded frame. 

14. A method of adaptive encoding at least a part of a current frame of a sequence of 
frames of framed data with respect to a reference frame comprised in said sequence, the 
method comprising the steps of: 

dividing said reference frame into blocks and labeling said blocks of said reference frame 
in accordance with the performance of a first sub-encoding step applied to said reference 
frame; 

computing a quantity based on the labeling of said blocks; 
dividing said current frame into blocks; 

performing said first sub-encoding step on a block of said current frame; 
performing a second sub-encoding step on said first sub-encoded block of said current 
frame, and adapting, in said second sub-encoding step, the encoding parameters thereof 
based on said quantity. 

15. An apparatus for implementing a two step encoding of a current frame of a sequence 
of frames of framed data, the two step encoding comprising a first sub-encoding and a 
second sub-encoding step, comprising: 

means for calculating a quantity of the current frame by prediction from a reference frame 
of the quantity that would be obtained when applying said first sub-encoding to said 
current frame; 

a decision circuit for deciding, based on said quantity, whether said two step encoding will be 
applied to said current frame or not. 

16. An apparatus for adaptive encoding of at least a part of a current frame of a sequence 
of frames of framed data with respect to a reference frame comprised in the sequence, 
comprising: 


an encoder for applying a first sub-encoding step to said reference frame; 

means for dividing said reference frame into blocks and for labeling said blocks of said 
reference frame in accordance with the output of the encoding circuit; 

means for computing a quantity based on the labeling of said blocks; 

means for deciding based on said quantity to perform or skip encoding of said current 
frame; and 

an encoder for performing said first sub-encoding on said current frame and 

an encoder for performing a second sub-encoding on said first sub-encoded frame in 
response to the decision circuit determining that said encoding is performed. 

17. An apparatus for adaptive encoding of at least a part of a current frame of a sequence 
of frames of framed data with respect to a reference frame comprised in said sequence, 
comprising: 

an encoder for applying a first sub-encoding to said reference frame; 

means for dividing said reference frame into blocks and labeling said blocks of said 
reference frame in accordance with the output of the encoding circuit; 
means for computing a quantity based on the labeling of said blocks; 
means for dividing said current frame into blocks; 

an encoder for performing said first sub-encoding on a block of said current frame; 

an encoder for performing a second sub-encoding on said first sub-encoded block of said 
current frame, and 

means for adapting the encoding parameters of said second sub-encoding circuit based on 
said quantity. 

ABSTRACT 

A METHOD AND APPARATUS FOR ADAPTIVE ENCODING FRAMED DATA 

SEQUENCES 

Methods and apparatus for adaptive encoding of at least a part of a current frame of a 
sequence of frames of framed data are described which operate on a block-by-block 
coding basis. The methods and apparatus divide at least a part of the current frame into 
blocks and then perform a first sub-encoding step on a block. Thereafter a second sub- 
encoding step is performed on the first sub-encoded block whereby the second sub- 
encoding step is optimised by adapting its encoding parameters based on a quantity of 
the first sub-encoded part of the current frame. The quantity is determined by 
prediction from a reference frame. Then the same steps are performed on another 
block of the part of the current frame. 

Typically, the framed data will be video frames for transmission over a transmission 
channel. The adaptation of the parameters for the second sub-encoding step may be 
made dependent upon the characteristics or limitations, e.g. bandwidth limitation, of 
the channel. In addition, the current frame may be discarded based on the predicted 
quantity and/or based on fullness of a buffer. 


Fig. 8B 